Large-Scale Approximate Kernel Canonical Correlation Analysis
Abstract
Kernel canonical correlation analysis (KCCA) is a fundamental method with broad applicability in statistics and machine learning. Although there exists a closed-form solution to the KCCA objective, obtained by solving an N × N eigenvalue system where N is the training set size, the memory and time requirements of this approach prohibit its use at large scale. Various approximation techniques have been developed for KCCA. A recently proposed approach first transforms the original inputs to an M-dimensional feature space using random kitchen sinks, so that the inner product in the feature space approximates the kernel function, and then applies linear CCA to the transformed inputs. In challenging applications, however, the dimensionality M of the feature space may need to be very large in order to reveal the nonlinear correlations, and it then becomes non-trivial to solve linear CCA for data matrices of very high dimensionality. We propose to use a recent stochastic optimization algorithm for linear CCA and its neural-network extension to further alleviate the computational requirements of approximate KCCA. This approach allows us to run approximate KCCA on a speech dataset with 1.4 million training samples and a random feature space of dimensionality M = 100000 on a normal workstation.
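The two-step pipeline described in the abstract (random features, then linear CCA) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: it assumes a Gaussian kernel exp(-gamma * ||x - y||^2), uses a tiny M, and solves linear CCA exactly with a regularized SVD rather than with the stochastic optimizer the abstract proposes. The function names, the toy data, and the values of `gamma` and `reg` are all hypothetical.

```python
import numpy as np


def random_fourier_features(X, M, gamma, rng):
    """Map X (N x d) to M random features whose inner products
    approximate the Gaussian kernel exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    # Frequencies drawn from the kernel's spectral density N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, M))
    b = rng.uniform(0.0, 2.0 * np.pi, size=M)
    return np.sqrt(2.0 / M) * np.cos(X @ W + b)


def inv_sqrt(S):
    """Inverse square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T


def linear_cca(F, G, k, reg=1e-3):
    """Exact regularized linear CCA via SVD of the whitened cross-covariance.
    Returns projections A, B and the top-k canonical correlations."""
    N = F.shape[0]
    F = F - F.mean(axis=0)
    G = G - G.mean(axis=0)
    Sff = F.T @ F / N + reg * np.eye(F.shape[1])
    Sgg = G.T @ G / N + reg * np.eye(G.shape[1])
    Sfg = F.T @ G / N
    Kf, Kg = inv_sqrt(Sff), inv_sqrt(Sgg)
    U, s, Vt = np.linalg.svd(Kf @ Sfg @ Kg)
    return Kf @ U[:, :k], Kg @ Vt[:k].T, s[:k]


def demo(N=2000, M=50, gamma=1.0, seed=0):
    """Toy two-view data (an assumption for illustration): y = t^2 + noise,
    so the views are nonlinearly related but nearly linearly uncorrelated."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(-1.0, 1.0, size=(N, 1))
    X = t
    Y = t**2 + 0.05 * rng.normal(size=(N, 1))
    raw_corr = abs(np.corrcoef(X[:, 0], Y[:, 0])[0, 1])
    F = random_fourier_features(X, M, gamma, rng)
    G = random_fourier_features(Y, M, gamma, rng)
    _, _, s = linear_cca(F, G, k=1)
    return raw_corr, s[0]


if __name__ == "__main__":
    raw_corr, top_corr = demo()
    print("linear correlation of raw views:", raw_corr)
    print("top canonical correlation after random features:", top_corr)
```

The exact solver above forms and decomposes M × M covariance matrices, which is the step that becomes expensive when M is very large; that cost is what motivates replacing it with a stochastic optimizer.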
Similar resources
On Column Selection in Approximate Kernel Canonical Correlation Analysis
We study the problem of column selection in large-scale kernel canonical correlation analysis (KCCA) using the Nyström approximation, where one approximates two positive semi-definite kernel matrices using “landmark” points from the training set. When building low-rank kernel approximations in KCCA, previous work mostly samples the landmarks uniformly at random from the training set. We propose...
Multi-View Canonical Correlation Analysis
Canonical correlation analysis (CCA) is a method for finding linear relations between two multidimensional random variables. This paper presents a generalization of the method to more than two variables. The approach is highly scalable, since it scales linearly with respect to the number of training examples and number of views (standard CCA implementations yield cubic complexity). The method i...
Adaptive Kernel Canonical Correlation Analysis for Estimation of Task Dynamics from Acoustics
We present a method for acoustic-articulatory inversion whose targets are the abstract tract variables from task dynamic theory. Towards this end we construct a non-linear Hammerstein system whose parameters are updated with adaptive kernel canonical correlation analysis. This approach is notably semi-analytical and applicable to large sets of data. Training behaviour is compared across four ke...
Approximate kernel competitive learning
Kernel competitive learning has been successfully used to achieve robust clustering. However, kernel competitive learning (KCL) is not scalable for large scale data processing, because (1) it has to calculate and store the full kernel matrix that is too large to be calculated and kept in the memory and (2) it cannot be computed in parallel. In this paper we develop a framework of approximate ke...
The Geometry Of Kernel Canonical Correlation Analysis
Canonical correlation analysis (CCA) is a classical multivariate method concerned with describing linear dependencies between sets of variables. After a short exposition of the linear sample CCA problem and its analytical solution, the article proceeds with a detailed characterization of its geometry. Projection operators are used to illustrate the relations between canonical vectors and variat...
Journal title:
- CoRR
Volume abs/1511.04773
Publication date: 2015